A METS Based Information Package for Long Term Accessibility of Web Archives

Author

  • Markus Enders
Abstract

The British Library’s web archive comprises several terabytes of harvested websites. Like other content streams, this data should be ingested into the library’s central preservation repository. The repository requires a standardized Submission and Archival Information Package. Harvested websites are stored in Archival Information Packages (AIPs). Each AIP is described by a METS file. Operational metadata for resource discovery as well as archival metadata are normalized and embedded in the METS descriptor using common metadata profiles such as PREMIS and MODS. The British Library’s METS profile for web archiving considers dissemination and preservation use cases, ensuring the authenticity of data. The underlying complex content model disaggregates websites into web pages, associated objects and their actual digital manifestations. The additional abstract layer ensures accessibility over the long term and the ability to carry out preservation actions such as migrations. The library-wide preservation policies and principles become applicable to web content as well.
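As a rough illustration of the kind of descriptor the abstract describes, the following Python sketch builds a minimal METS document that wraps a MODS title and a PREMIS event and links a website to its pages and files through a logical structure map. It is not the British Library's actual METS profile: the function name `build_aip_descriptor`, the identifiers (`DMD1`, `PROV1`, `FILE1`, ...), the `TYPE` and `USE` values, and the two-level website/webpage hierarchy are all assumptions made for illustration, and the output is not validated against the METS, MODS, or PREMIS schemas.

```python
import xml.etree.ElementTree as ET

# Namespace URIs of the standards named in the abstract.
METS_NS = "http://www.loc.gov/METS/"
MODS_NS = "http://www.loc.gov/mods/v3"
PREMIS_NS = "info:lc/xmlschema/premis-v2"
XLINK_NS = "http://www.w3.org/1999/xlink"

for prefix, uri in (("mets", METS_NS), ("mods", MODS_NS),
                    ("premis", PREMIS_NS), ("xlink", XLINK_NS)):
    ET.register_namespace(prefix, uri)


def q(ns, tag):
    """Return a namespace-qualified tag name in ElementTree notation."""
    return f"{{{ns}}}{tag}"


def wrap_metadata(section, mdtype):
    """Add mdWrap/xmlData to a metadata section and return the xmlData node."""
    wrap = ET.SubElement(section, q(METS_NS, "mdWrap"), MDTYPE=mdtype)
    return ET.SubElement(wrap, q(METS_NS, "xmlData"))


def build_aip_descriptor(site_title, pages):
    """Build a simplified METS descriptor (hypothetical layout).

    pages is a list of (page_label, [file_href, ...]) tuples.
    """
    mets = ET.Element(q(METS_NS, "mets"))

    # Descriptive metadata for resource discovery, embedded as MODS.
    dmd = ET.SubElement(mets, q(METS_NS, "dmdSec"), ID="DMD1")
    mods = ET.SubElement(wrap_metadata(dmd, "MODS"), q(MODS_NS, "mods"))
    title_info = ET.SubElement(mods, q(MODS_NS, "titleInfo"))
    ET.SubElement(title_info, q(MODS_NS, "title")).text = site_title

    # Archival (provenance) metadata, embedded as an incomplete PREMIS event.
    amd = ET.SubElement(mets, q(METS_NS, "amdSec"), ID="AMD1")
    prov = ET.SubElement(amd, q(METS_NS, "digiprovMD"), ID="PROV1")
    event = ET.SubElement(wrap_metadata(prov, "PREMIS"), q(PREMIS_NS, "event"))
    ET.SubElement(event, q(PREMIS_NS, "eventType")).text = "capture"

    # File inventory and logical structure: website -> web page -> files.
    file_sec = ET.SubElement(mets, q(METS_NS, "fileSec"))
    file_grp = ET.SubElement(file_sec, q(METS_NS, "fileGrp"), USE="harvested")
    struct_map = ET.SubElement(mets, q(METS_NS, "structMap"), TYPE="logical")
    site = ET.SubElement(struct_map, q(METS_NS, "div"), TYPE="website",
                         LABEL=site_title, DMDID="DMD1", ADMID="PROV1")

    counter = 0
    for label, hrefs in pages:
        page = ET.SubElement(site, q(METS_NS, "div"),
                             TYPE="webpage", LABEL=label)
        for href in hrefs:
            counter += 1
            file_id = f"FILE{counter}"
            file_el = ET.SubElement(file_grp, q(METS_NS, "file"), ID=file_id)
            ET.SubElement(file_el, q(METS_NS, "FLocat"),
                          {"LOCTYPE": "URL", q(XLINK_NS, "href"): href})
            # The page points at each of its files by ID.
            ET.SubElement(page, q(METS_NS, "fptr"), FILEID=file_id)
    return mets
```

Calling `ET.tostring(build_aip_descriptor("Example Site", [("Home", ["index.html"])]), encoding="unicode")` serializes the descriptor. Keeping the structure map separate from the file section is what lets a file be replaced, for example after a migration, without changing the logical description of the page.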

Similar resources

Turning pure Web Page Storages into Living Web Archives

Web content plays an increasingly important role in the knowledge-based society, and the preservation and long-term accessibility of Web history has high value (e.g., for scholarly studies, market analyses, intellectual property disputes, etc.). There is strongly growing interest in its preservation by libraries and archival organizations as well as emerging industrial services. Web content cha...

Terminology Evolution in Web Archiving: Open Issues

The correspondence between the terminology used for querying and the one used in content objects to be retrieved is a crucial prerequisite for effective retrieval technology. However, as terminology is evolving over time, a growing gap opens up between older documents in (long-term) archives and the active language used for querying such archives. Thus, technologies for detecting and systemati...

Evaluation of the Web-Based OPACs of the Central Libraries of State Universities in Tehran

In this article, nine web-based OPACs of state universities’ central libraries in Tehran were evaluated. For this purpose, a list of criteria for the evaluation of OPACs was first drawn from the literature and an evaluation checklist was prepared. The OPACs were then evaluated using the checklist. The results showed the need for some improvement in the design of the OPACs. In terms of accessibility an...

The Utilization of Web-based Continuing Medical Education Courses in Mashhad University of Medical Sciences and its Relationship with Course Characteristics

Introduction: One of the educational methods that can overcome time and distance limitations is electronic learning. Healthcare professionals, who face such limitations, also need continuing education to keep their knowledge up to date. This study was conducted to evaluate the utilization of electronic courses by the medical professionals in Mashhad University of Medical Sciences and its ...

First Results on Detecting Term Evolutions

The archival of content like publications or web pages is just the first step toward “full” content preservation. It also has to be guaranteed that content can be found and interpreted in the long run. The correspondence between the terminology used for querying and the one used in content objects to be retrieved is a crucial prerequisite for effective retrieval technology. However, a...

Publication date: 2010